Translated by Google Translate
In classic reinforcement learning algorithms, agents make decisions at discrete, fixed time intervals. The physical duration between one decision and the next becomes a critical hyperparameter. When this duration is too short, the agent must make many decisions to achieve its goal, which makes the problem harder. When it is too long, the agent can no longer control the system. Physical systems, however, do not need a constant control frequency. For learning agents, it is desirable to operate at low frequency when possible and at high frequency when necessary. We propose a framework called Continuous-Time Continuous-Options (CTCO), in which the agent chooses options as sub-policies of variable duration. Such options are time-continuous and can interact with the system at any desired frequency, providing a smooth change of actions. The empirical analysis shows that our algorithm is competitive with other time-abstraction techniques, such as classic option learning and action repetition, and in practice overcomes the difficult choice of decision frequency.
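To make the idea of variable-duration, time-continuous options concrete, here is a minimal Python sketch. It is not the CTCO parameterization, which the abstract does not specify: the `Option` class, the linear interpolation between a start and an end action, and the `rollout` helper are all hypothetical and chosen only for illustration. The point it shows is that the number of agent decisions depends on the options chosen, not on the rate at which the system is actuated.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """Hypothetical option: an open-loop action trajectory with its own duration."""
    start_action: float
    end_action: float
    duration: float  # seconds; chosen by the agent for each option

    def action_at(self, t: float) -> float:
        """Action at time t since the option started (assumed linear interpolation)."""
        frac = min(max(t / self.duration, 0.0), 1.0)
        return self.start_action + frac * (self.end_action - self.start_action)


def rollout(options, control_dt):
    """Sample a sequence of options at a fixed low-level period control_dt.

    Returns a list of (time, action) pairs. The agent makes len(options)
    decisions regardless of how small control_dt is.
    """
    samples = []
    t_global = 0.0
    for opt in options:
        steps = max(1, round(opt.duration / control_dt))
        for k in range(steps):
            t_local = k * control_dt
            samples.append((t_global + t_local, opt.action_at(t_local)))
        t_global += opt.duration
    return samples


# Two decisions: a slow 2-second ramp, then a short 0.5-second hold.
plan = [Option(0.0, 1.0, 2.0), Option(1.0, 1.0, 0.5)]
coarse = rollout(plan, control_dt=0.5)  # actuated at 2 Hz
fine = rollout(plan, control_dt=0.1)    # actuated at 10 Hz, same two decisions
```

Both rollouts follow the same action trajectory; only the sampling density changes, which is the sense in which such options can be queried at any desired frequency.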
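Neural networks are ubiquitous in applied machine learning for education. Their widespread success in predictive performance comes with a serious weakness: the lack of explainability of their decisions, which matters especially in human-centric fields. We implement five state-of-the-art methods for explaining black-box machine learning models (LIME, PermutationSHAP, KernelSHAP, DiCE, CEM) and examine the strengths of each approach on the downstream task of student performance prediction for five massive open online courses. Our experiments show that the families of explainers do not agree with each other on feature importance for the same bidirectional LSTM models and the same representative set of students. We use principal component analysis, Jensen-Shannon distance, and Spearman's rank correlation to quantitatively cross-examine explanations across methods and courses. In addition, we validate explainer performance across curriculum-based prerequisite relationships. Our results lead to the conclusion that the choice of explainer is an important decision and is in fact paramount to the interpretation of the predictive results, even more so than the course the model is trained on. Source code and models are released at http://github.com/epfl-ml4ed/evaluating-explainers.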
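As an illustration of how two explainers' feature-importance outputs might be compared with the metrics named above, the following Python sketch computes the Jensen-Shannon distance and Spearman's rank correlation between two importance vectors. It is a standard-library-only sketch, not the authors' evaluation code: the function names, the normalization of absolute scores into a distribution, and the example vectors are assumptions made for illustration. In practice, library routines such as `scipy.spatial.distance.jensenshannon` and `scipy.stats.spearmanr` would likely be used instead.

```python
import math


def normalize(scores):
    """Turn absolute importance scores into a probability distribution (assumed preprocessing)."""
    mags = [abs(s) for s in scores]
    total = sum(mags)
    if total == 0:
        raise ValueError("all importance scores are zero")
    return [m / total for m in mags]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the base-2 Jensen-Shannon divergence, which lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical importance scores for the same four features from two explainers.
explainer_a = [0.50, 0.30, 0.15, 0.05]
explainer_b = [0.05, 0.15, 0.30, 0.50]

distance = jensen_shannon_distance(normalize(explainer_a), normalize(explainer_b))
rho = spearman(explainer_a, explainer_b)
```

A distance near 0 together with a correlation near 1 would indicate that two explainers agree; the made-up vectors here rank the features in exactly opposite order, so they represent the kind of disagreement the abstract reports.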